Supplementary Material: Discovering Reinforcement Learning Algorithms

Oh, Junhyuk, Hessel, Matteo, Czarnecki, Wojciech M., Xu, Zhongwen, van Hasselt, Hado, Singh, Satinder, Silver, David (DeepMind)

Neural Information Processing Systems

In tabular grid worlds, object locations are randomised across lifetimes but fixed within a lifetime. There are two different action spaces; one version has only 9 movement actions. The episode terminates after a fixed number of steps (i.e., the chain length). There is no state aliasing because all states are distinct. We trained LPGs by simulating 960 parallel lifetimes (i.e., the batch size for meta-gradients). Rectified linear units (ReLU) were used as the activation function throughout the experiments.


Modeling Multi-Step Scientific Processes with Graph Transformer Networks

Volk, Amanda A., Epps, Robert W., Ethier, Jeffrey G., Baldwin, Luke A.

arXiv.org Artificial Intelligence

This work presents the use of graph learning for the prediction of multi-step experimental outcomes across experimental research, including materials science, chemistry, and biology. The viability of geometric learning for regression tasks was benchmarked against a collection of linear models through a combination of simulated and real-world training studies. First, five arbitrarily designed multi-step surrogate functions were developed to reflect features commonly found in experimental processes. A graph transformer network outperformed all tested linear models in scenarios featuring hidden interactions between process steps and sequence-dependent features, while retaining equivalent performance in sequence-agnostic scenarios. Then, a similar comparison was applied to real-world literature data on algorithm-guided colloidal atomic layer deposition. Using the complete reaction sequence as training data, the graph neural network outperformed all linear models in predicting the three spectral properties for most training-set sizes. Further implementation of graph neural networks and geometric representations of scientific processes for predicting experimental outcomes could enable algorithm-driven navigation of higher-dimensional parameter spaces and efficient exploration of more dynamic systems.
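The geometric representation the abstract describes can be sketched minimally: each process step is a node carrying its parameters as a feature vector, and directed edges encode step order. The feature names, edge list, and single round of mean message passing below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

# Three-step process; features could be e.g. [temperature, duration, concentration]
# (hypothetical choices for illustration).
node_feats = np.array([
    [150.0, 10.0, 0.5],   # step 1
    [200.0,  5.0, 0.2],   # step 2
    [120.0, 30.0, 0.8],   # step 3
])
edges = [(0, 1), (1, 2)]  # directed edges give the sequence of steps

def message_pass(feats, edges):
    """Average each node's features with those of its in-neighbours."""
    out = feats.copy()
    for src, dst in edges:
        out[dst] = (feats[dst] + feats[src]) / 2.0
    return out

# One round mixes information between consecutive steps, so the readout
# can reflect sequence-dependent interactions a flat feature vector loses.
h = message_pass(node_feats, edges)
graph_embedding = h.mean(axis=0)  # graph-level readout fed to a regressor
```

A trained graph transformer replaces the fixed averaging with learned, attention-weighted aggregation, but the node/edge encoding of the process is the same idea.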


Zero-shot Imitation Policy via Search in Demonstration Dataset

Malato, Federico, Leopold, Florian, Melnik, Andrew, Hautamaki, Ville

arXiv.org Artificial Intelligence

Behavioral cloning uses a dataset of demonstrations to learn a policy. To overcome computationally expensive training procedures and address the policy adaptation problem, we propose to use the latent spaces of pre-trained foundation models to index a demonstration dataset, instantly access similar relevant experiences, and copy behavior from these situations. Actions from a selected similar situation can be performed by the agent until the representations of the agent's current situation and the selected experience diverge in the latent space. Thus, we formulate our control problem as a dynamic search problem over a dataset of experts' demonstrations. We test our approach on the BASALT MineRL dataset in the latent representation of a Video Pre-Training model, and compare it to state-of-the-art, imitation-learning-based Minecraft agents. Our approach effectively recovers meaningful demonstrations and produces human-like agent behavior in the Minecraft environment across a wide variety of scenarios. Experimental results show that our search-based approach clearly outperforms learning-based models in both accuracy and perceptual evaluation.
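The search-and-copy loop the abstract describes can be sketched in a few lines: find the demonstration timestep whose latent embedding is nearest to the agent's current embedding, replay its actions, and re-search once the latents diverge. The array names, dimensions, and divergence threshold below are placeholder assumptions, not the VPT model's actual interface:

```python
import numpy as np

# Stand-in latent embeddings: one row per timestep of the demonstration
# dataset, plus the action recorded at that timestep (all hypothetical).
rng = np.random.default_rng(0)
demo_latents = rng.normal(size=(1000, 64))
demo_actions = rng.integers(0, 8, size=1000)

def nearest_demo(query, latents):
    """Index of the demonstration state closest to `query` in latent space."""
    dists = np.linalg.norm(latents - query, axis=1)
    return int(np.argmin(dists))

def act(current_latent, followed, threshold=5.0):
    """Copy the demonstrated action while the agent's latent stays close to
    the selected experience; trigger a new search when they diverge."""
    if followed is None or np.linalg.norm(demo_latents[followed] - current_latent) > threshold:
        followed = nearest_demo(current_latent, demo_latents)
    return demo_actions[followed], followed + 1  # advance along the demo

# One step of the control loop with a fresh (random) agent observation
action, next_idx = act(rng.normal(size=64), followed=None)
```

No gradient updates are needed at deployment time, which is the source of the "zero-shot" adaptation: switching datasets only changes what gets indexed.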


Discovering Reinforcement Learning Algorithms

Oh, Junhyuk, Hessel, Matteo, Czarnecki, Wojciech M., Xu, Zhongwen, van Hasselt, Hado, Singh, Satinder, Silver, David

arXiv.org Artificial Intelligence

Reinforcement learning (RL) algorithms update an agent's parameters according to one of several possible rules, discovered manually through years of research. Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments. Although there have been prior attempts at addressing this significant scientific challenge, it remains an open question whether it is feasible to discover alternatives to fundamental concepts of RL such as value functions and temporal-difference learning. This paper introduces a new meta-learning approach that discovers an entire update rule, which includes both 'what to predict' (e.g. value functions) and 'how to learn from it' (e.g. bootstrapping), by interacting with a set of environments. The output of this method is an RL algorithm that we call Learned Policy Gradient (LPG). Empirical results show that our method discovers its own alternative to the concept of value functions. Furthermore, it discovers a bootstrapping mechanism to maintain and use its predictions. Surprisingly, when trained solely on toy environments, LPG generalises effectively to complex Atari games and achieves non-trivial performance. This shows the potential to discover general RL algorithms from data.
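The two-level structure of this kind of meta-learning can be illustrated with a deliberately tiny stand-in: an "update rule" reduced to a single meta-parameter is evaluated over short agent lifetimes, and the meta-parameter is nudged toward rules whose agents collect more reward. This is a structural sketch only; LPG's actual update rule is a neural network trained with meta-gradients, not the scalar hill-climbing shown here:

```python
import random

random.seed(0)

def lifetime_return(meta_lr, steps=200):
    """One agent lifetime on a two-armed bandit; the 'update rule' here
    just scales prediction errors by the meta-parameter meta_lr."""
    q = [0.0, 0.0]           # agent parameters (action-value estimates)
    payout = [0.3, 0.7]      # arm 1 pays off more often
    total = 0.0
    for _ in range(steps):
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < 0.1 else q.index(max(q))
        r = 1.0 if random.random() < payout[a] else 0.0
        q[a] += meta_lr * (r - q[a])   # inner-loop update produced by the rule
        total += r
    return total

# Outer loop: adjust the rule's meta-parameter using lifetime returns
meta_lr = 0.01
for _ in range(30):
    up, down = lifetime_return(meta_lr * 1.5), lifetime_return(meta_lr / 1.5)
    meta_lr = meta_lr * 1.5 if up >= down else meta_lr / 1.5
```

The key point mirrored from the paper is the separation of timescales: the inner loop updates agent parameters within a lifetime, while the outer loop updates the rule itself across many lifetimes.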


Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions

Défossez, Alexandre, Bach, Francis

arXiv.org Machine Learning

We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent (a.k.a. least-mean-squares). In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into a variance term decaying as $O(1/n)$, independently of the step-size $\gamma$, and a bias term decaying as $O(1/(\gamma^2 n^2))$; (c) when allowing non-uniform sampling, the choice of a good sampling density depends on whether the variance or bias term dominates. In particular, when the variance term dominates, optimal sampling densities do not lead to much gain, while when the bias term dominates, we can choose larger step-sizes that lead to significant improvements.
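The algorithm under analysis is short enough to write out: a constant-step-size LMS pass over the data together with a running Polyak-Ruppert average of the iterates, which is what the bias/variance expansion describes. The problem sizes, noise level, and the step size $\gamma$ below are illustrative choices, not values from the paper:

```python
import numpy as np

# Synthetic least-squares problem (dimensions and noise are arbitrary)
rng = np.random.default_rng(1)
d, n = 5, 20000
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

gamma = 0.05           # constant step size (must be small enough for stability)
w = np.zeros(d)        # current iterate
w_bar = np.zeros(d)    # Polyak-Ruppert average of the iterates

for t in range(n):
    x_t, y_t = X[t], y[t]
    # LMS update: stochastic gradient of the squared loss at one sample
    w -= gamma * (x_t @ w - y_t) * x_t
    w_bar += (w - w_bar) / (t + 1)   # running average, no extra storage

# The averaged iterate, not the last iterate, is what enjoys the
# O(1/n) variance and O(1/(gamma^2 n^2)) bias decomposition.
err = np.linalg.norm(w_bar - w_star)
```

Averaging is what removes the step-size dependence from the variance term; the last iterate `w` alone keeps oscillating in a ball whose radius depends on $\gamma$.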